An approach to online Bayesian learning from multiple data streams

Authors

  • R. Chen
  • K. Sivakumar
  • H. Kargupta
Abstract

We present a collective approach to mining Bayesian networks from distributed, heterogeneous web-log data streams. In this approach, we first learn a local Bayesian network at each site using the local data. Each site then identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learnt at the central site using the data transmitted from the local sites. The local and central Bayesian networks are combined to obtain a collective Bayesian network that models the entire data. This technique is then adapted to online Bayesian learning, where the network parameters are updated sequentially based on new data from multiple streams. We applied this technique to mine multiple data streams where data centralization is difficult because of long response times and scalability issues. The approach is particularly suitable for mining applications with distributed sources of data streams in an environment with non-zero communication cost (e.g. wireless networks). Experimental results and theoretical justification that demonstrate the feasibility of our approach are presented.
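The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a site decides which observations indicate coupling, so the sketch assumes that the observations least likely under an empirical local model are the ones transmitted. The names `OnlineCPT` and `least_likely`, the binary variables, and the pseudo-count prior are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


class OnlineCPT:
    """Conditional probability table for one binary node, updated one record at a time."""

    def __init__(self, prior=1.0):
        self.prior = prior  # pseudo-count per value (an assumption, not from the paper)
        self.counts = defaultdict(Counter)  # parent configuration -> counts of child values

    def update(self, parents, value):
        """Fold a single new observation into the counts (the sequential update)."""
        self.counts[parents][value] += 1

    def prob(self, value, parents):
        """Smoothed estimate of P(child = value | parents)."""
        c = self.counts[parents]
        return (c[value] + self.prior) / (c[0] + c[1] + 2 * self.prior)


def least_likely(rows, fraction):
    """Indices of the rows that are rarest under the empirical local distribution.

    Assumed selection rule: low local likelihood is treated as a hint that
    non-local variables influence the observation.
    """
    freq = Counter(rows)
    k = max(1, int(len(rows) * fraction))
    ranked = sorted(range(len(rows)), key=lambda i: (freq[rows[i]], i))
    return sorted(ranked[:k])


# Two sites observe different variables for the same record index.
site_a = [(0, 0), (0, 0), (1, 1), (0, 0), (1, 0), (0, 0)]
site_b = [(0,), (0,), (1,), (0,), (1,), (0,)]

# Site A selects a subset of record indices to transmit.
picked = least_likely(site_a, fraction=0.34)

# The central site learns a cross-site link (first variable of A -> variable of B)
# from the transmitted records only, one record at a time.
cross_link = OnlineCPT()
for i in picked:
    cross_link.update(site_a[i][:1], site_b[i][0])

print(picked, cross_link.prob(1, (1,)))
```

Keeping only counts at each node means a new record from any stream changes the parameter estimate with a single increment, which is one simple way to realize sequential updating without storing or resending earlier data.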


Similar resources

An Empirical Comparison of Bayesian Network Parameter Learning Algorithms for Continuous Data Streams

We compare three approaches to learning numerical parameters of Bayesian networks from continuous data streams: (1) the EM algorithm applied to all data, (2) the EM algorithm applied to data increments, and (3) the online EM algorithm. Our results show that learning from all data at each step, whenever feasible, leads to the highest parameter accuracy and model classification accuracy. When fac...


Learning from Data Streams with Concept Drift

Increasing access to incredibly large, nonstationary datasets and corresponding demands to analyse these data have led to the development of new online algorithms for performing machine learning on data streams. An important feature of real-world data streams is "concept drift," whereby the distributions underlying the data can change arbitrarily over time. The presence of concept drift in a d...


Cost Sensitive Online Multiple Kernel Classification

Learning from data streams has been an important open research problem in the era of big data analytics. This paper investigates supervised machine learning techniques for mining data streams with application to online anomaly detection. Unlike conventional machine learning tasks, machine learning from data streams for online anomaly detection has several challenges: (i) data arriving sequentia...


Dynamic Programming for Bayesian Logistic Regression Learning under Concept Drift

A data stream is an ordered sequence of training instances arriving at a rate that does not permit storing them permanently in memory, which makes online learning methods necessary when trying to predict some hidden target variable. In addition, concept drift often occurs, which means that the statistical properties of the target variable may change over time. In this paper, we pre...


The Effect of Using Online Metacognitive Strategies Practice on EFL Learners’ Vocabulary Achievement: A Blended Approach

This study investigated a new blended approach for enhancing vocabulary achievement. To this end, the project used a convenience sample of 50 intermediate EFL learners ranging from 19-35 years of age from three intact classes studying English translation at university. They were randomly assigned to two groups of 25 students each. In the experimental group, the treatment consisted of providing ...




Publication date: 2001